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Abstract 

The global macroscopic behaviour of a dynamical system is encoded in the eigen- 
functions of a certain transfer operator associated to it. For systems with low dimen- 
sional long term dynamics, efficient techniques exist for a numerical approximation 
of the most important eigenfunctions, cf. [7\. They are based on a projection of the 
operator onto a space of piecewise constant functions supported on a neighborhood 
of the attractor - Ulam's method. 

In this paper we develop a numerical technique which makes Ulam's approach 
applicable to systems with higher dimensional long term dynamics. It is based 
on ideas for the treatment of higher dimensional partial differential equations us- 
ing sparse grids [3TJ |2]. We develop the technique, establish statements about its 
complexity and convergence and present two numerical examples. 

1 Introduction 

Recently, numerical techniques have been developed which enable a coarse grained, yet 
global statistical analysis of the long term behaviour of certain dynamical systems. The 
basic algorithmic approach is to construct a box covering of some set of interest in phase 
space (e.g. the attractor of the system) [U |5]. The cells in this covering then constitute 
the states of a finite Markov chain. The transition matrix of this chain (i.e. the matrix 
of transition probabilities between the boxes) can be viewed as a finite approximation 
to the transfer (or Frobenius-Perron) operator of the system. This operator describes 
how probability distributions on phase space evolve according to the dynamical system 
under consideration. In certain cases and in the appropriate functional analytic setting, 
eigenmodes of this operater can be used to charaterize the long term behaviour of the 
dynamics. Certain stationary distributions of the operator characterize how frequently 
typical trajectories visit certain parts of phase space. Eigenmodes at roots of unity enable 
the detection of macroscopic cycles in the dynamics and eigenmodes at real eigenvalues 
close to one yield a decomposition of phase space into almost invariant sets, i.e. sets 
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for which the probability for a typical point to be mapped back into the set is large 
[?]. The latter concept has e.g. been used in order to detect and compute biomolecular 
conformations, cf . ESI ES EH [TO] . 

Formally, the construction of the Markov chain can be viewed as projecting the transfer 
operator onto the space of functions which are piecewise constant on the elements of 
the box covering. Ulam conjectured [30] that for maps on the interval, the stationary 
distribution of the chain converges to an invariant density (i.e. a stationary distribution) 
of the map. This has been proved for certain expanding maps by Li [25] and since 
then for various special classes of maps or stochastic processes also in higher dimensions 

[El EH US [131 EH [7]. 

Ulam's method in combination with the subdivision approach from [H [3] for the com- 
putation of the box covering works fine for systems with a low dimensional attractor, cf. 
also [3j [8]. For systems with higher dimensional long term dynamics the approach be- 
comes inefficient due to the curse of dimension: the number of boxes in the covering scales 
exponentially in the dimension of the attractor. Adaptive approaches to the construction 
of the box covering [01 EI] do not remedy this fact. 

In this paper we propose to attack this discretization task using ideas from sparse 
grids [291 ED E]- In this approach, which is e.g. being used in the numerical solution 
of partial differential equations on higher dimensional domains, a basis of [0, l] d is build 
from a hierarchical basis of [0, 1] via a tensor product construction. The entire basis can 
be decomposed into subspaces which are spanned by basis functions of the same level of 
the Id hierarchy in each factor. To each subspace one can associate its approximation 
benefit and its cost (which is typically given by its dimension). The idea of the sparse 
grid approach is to assemble a finite dimensional approximation space by choosing only 
those subspaces whith the highest benefit to cost ratio. 

In order to discretize the Frobenius-Perron operator, we employ a piecewise constant 
sparse hierarchical tensor basis (i.e. using the Haar system as the underlying Id basis). 
This basis provides an approximation error of 0{n~ x ■ (log v^)^ 1 ) f° r functions with 
bounded first derivatives, requiring a computational effort of 0(n ■ (log ^yn) d ~ 1 ) (where 
n denotes the number of degrees of freedom in one coordinate direction and d is the 
dimension of phase space). In comparison, the standard Ulam basis requires 0(n d ) basis 
functions in order to obtain an approximation error of O^n^ 1 ). 

The paper is structured as follows: in Section 2.1 we collect relevant basic concepts 
from dynamical systems theory, in particular Ulam's method. In Section [3] we develop 
the Sparse Ulam method by constructing the hierarchical tensor basis, deriving approxi- 
mation properties, outlining the construction of the optimal approximation subspace and 
comparing cost and accuracy of the new method with the standard Ulam approach. The 
section closes with statements about the convergence properties. In Section [4] we collect 
considerations concerning an efficient implementation of our approach. In particular, we 
derive estimates on the computational effort as a function of the required accuracy. Sec- 
tion [5] presents two numerical examples: a comparison with Ulam's method for a three 
dimensional map with a smooth invariant density and a computation of the leading eigen- 
functions of the transfer operator for a four-dimensional map, constructed via a tensor 
product from two two-dimensional standard maps. 
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Our implementation of the Sparse Ulam method as well as the code for the example 
computations is freely available from the homepage of the authors. 



2 Transfer operators and Ulam's method 

2.1 Long term dynamics and the Frobenius-Perron operator 

Let S : X — > X, X C M. d , be a discrete dynamical system which is measureable w.r.t 
the Borel-a-algebra B on X. Let Aic be the set of all bounded complex valued measures 
on (X, B) and M. C M.c be the subset of probability measures. The Frobenius-Perron 
operator (or transfer operator) P : A4c — > Mc, 

Pfi = fioS~ 1 ) (2.1) 

describes how (probability) measures on phase space evolve according to the dynamics 
defined by S. A measure is called invariant if it is a fixed point of P. A set A C X is 
called invariant if A = S^ 1 (A). An invariant probability measure fx is ergodic if every 
invariant set has either full or zero /x-measure. Birkhoff's ergodic theorem [TJ states that 
ergodic measures characterize the long time behaviour of the system: Let fx be ergodic 
and if : X — ► M be a /i-integrable observable, then 



1 n_1 r 

lim -^(S fe (*))= / ^d/i (2.2) 



for /x- almost all a; G X . 

Definition 2.1: A probability measure \x is called S'iii? measure or natural invariant 



measure if (2.2) holds for continuous observables if and all points x G £/ in a set U C X 
with positive Lebesgue measure. 

SRB measures are defined via a property which we would like them to have. But how 



does one see whether a measure is SRB? After all, equation (2.2) is not easy to check in 
general. On the other hand, if fx is an ergodic measure which is absolutely continuous 
w.r.t. the Lebesgue-measure m, i.e. if there is a density f with fi(A) = j A f dm for all 
measurable A, then fx is SRB. 



Using (2.1), we can directly define the Frobenius-Perron operator on Lebesgue inte- 



grable functions / : X — > C: 



Pf dm = / f dm VA G B. (2.3) 

A Js- 1 {A) 



If S is differentiable, we obtain the explicit expression 

m 



" \ D S(y)\' 
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2.2 Almost invariance 



Invariant measures (or densities) are fixed points of the Frobenius-Perron operator, i.e. 
eigenmeasures resp. -functions at the eigenvalue 1. Eigenvectors at eigenvalues close to 
one are related to almost invariant sets: Intuitively, an almost invariant set of S is a 
subset A C X such that the invariance ratio 

m(S-\A)nA) 

Pm{A) := —r- 

m(A) 

is close to 1, i.e. a point which is chosen randomly from A with respect to the measure m 
maps into A with high probability. More precisely, we say that 

Definition 2.2: A subset A C X is g- almost-invariant w.r.t. the probability measure \x 
if n(A) ^ and 

PM) = Q- ( 2 - 4 ) 
Let v e Aic be an eigenmeasure of P at an eigenvalue A ^ 1. Since \v(X) = 
(Pv)(X) = u(S- 1 (X)) = v(X), it follows that v{X) = 0. In particular, if A < 1 and v 
are real, then there are two positive real measures u + , v~ with disjoint supports such that 
v = v + — v~ (Hahn-Jordan decomposition). The following theorem relates the invariance 
ratios of the supports of v + and v~ to the eigenvalue A. 

Theorem 2.3: [7] Let v be a normalize^ eigenmeasure of P at the real eigenvalue A < 1. 
Then 

p wl (A + )+p H (A-) = X + l } (2.5) 
where A + = suppz/ + and A~ = suppz/~. 



2.3 Ulam's method 

In order to approximate the (most important) eigenfunctions of the Frobenius-Perron 
operator, we have to discretize the corresponding infinite dimensional eigenproblem. Ulam 
[30] proposed to project the L 1 eigenvalue problem Pf = Xf into a finite dimensional 
subspace of piecewise constant functions: Let (V^) n6 N be a sequence of approximation 
subspaces of L 1 with dim V n — n and let Q n : L 1 — >• V n be corresponding projections into 
V n . The sequences (V n ) and (Q n ) should be chosen such that Q n converges pointwise to 
the identity on L 1 . We define the discretised Frobenius-Perron operator as 

P n '■= QnP \Vn ■ 

We now choose the approximation spaces to be spanned by piecewise constant func- 
tions. To this end, let X n = {Xi, . . . ,X n } be a disjoint partition of X with m(Xi) — > 
as n — > oo. Define V n := spanjxi, • • • , Xn}, where Xi denotes the characteristic function 
ofXi. 

Further, let 

n If 

Q n h:=y^CiXi with Ci := —— / hdm, 

'i-e- W\(X) = 1. 
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yielding -P n A+ — an d Pn^-n Q A n , where A n := {h G V n : f \h\ dm = l} and 
A^ := {h G A n : h > 0} . Due to Brouwer's fixed point theorem there always exists an 
approximative invariant density /„ = P n f n G A+. The matrix representation of the linear 
map P n : A n — > A n w.r.t. the basis of characteristic functions is given by the transition 
matrix with entries 

m(AA 

Ulam conjectured [30] that if P has a unique stationary density /, then a sequence (f n )neN 
converges to / in L 1 . It is still an open question under which conditions on S this is true in 
general. Li [23] proved the conjecture for expanding, piecewise continuous interval maps, 
Ding an Zhou [13] for the corresponding multidimensional case. 

In [7], Ulam's method was applied to a small random perturbation of S which might 
be chosen such that the corresponding transfer operator is compact on L 2 . In this case, 
perturbation results [23] (section IV.3.5.) for the spectrum of compact operators imply 
convergence. 

2.4 Computing the transition matrix 



The computation of one matrix entry (2.6) requires a ci-dimensional quadrature. A stan- 



dard approach to this is Monte-Carlo quadrature (also cf. [20J ) , i.e. 

1 - 

^TrE*^*))' ( 2 - 7 ) 

k=l 

where the points x%, . . . , xk are chosen i.i.d from Xj according to a uniform distribution. 
In [19], a recursive exhaustion technique has been developed in order to compute the 
entries to a prescribed accuracy. However, this approach relies on the availability of local 
Lipschitz estimates on S which might not be cheaply computable in the case that S is 
given as the time-T-map of a differential equation. 

For the Monte-Carlo technique, consider a uniform partition of the unit cube into 
M d congruent cubes of edge length 1/M. Let Pm denote the transition matrix for this 
partition and let Pm be its Monte-Carlo approximation. According to the central limit 
theorem (and its error-estimate, the Berry- Esseen theorem |14| ) we havd^ 

\Pij-Pij\ £ l/V^ (2.8) 
for the absolute error of the entries of Pm- As a consequence, we need 

M d 

sample points in total in order to achieve an absolute error of less than TOL for all the 
entries pij. Note that the accuracy of the entries of Pm imposes a restriction on the 
achievable accuracy of the eigenvectors of Pm- 

2 We write a(K) < b(K) if there is a constant c > independet of K such that a(K) < cb(K). 
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3 The Sparse Ulam method 



A naive application of Ulam's method to higher dimensional systems suffers from the curse 
of dimension: in order to achieve an L 1 -accuracy of 0{e) one needs an approximation 
space of dimension O (£~ d ) - translating into a prohibitively large computational effort 
for higher dimensional systems. There is a remedy to this problem for systems with 
low dimensional long term dynamics [7j: the idea is to first compute a covering of 
the attractor of the system. On this (low dimensional) covering, Ulam's method can 
successfully be applied. 

To avoid the exponential growth of complexity in the system (or attractor) dimension, 
we now follow an idea which was originally developed for quadrature problems [29J and 
used for the treatment of higher dimensional partial differential equations, cf. for exam- 
ple (3TJ E] : sparse grids. In fact, we change from the standard Ulam basis to a sparse 
hierarchical one in order to obtain a better cost /accuracy relation. In the following, we 
discuss the chosen basis in detail, as well its advantages and disadvantages. 



3.1 The Haar basis 

We describe the Haar basis on the (i-dimensional unit cube [0, l] d , deriving the multidi- 
mensional basis functions from the one dimensional ones, see e.g. [18J. Let 



/HaarOr) = -sign(x) • (\x\ < 1), (3.1) 

where (\x\ < 1) equals 1, if the inequality is true, otherwise 0. A basis function of the 
Haar basis is defined by the two parameters level i and center (point) j: 

ftAX) ^ { 2^ ■ /Haar (2* (x ~ Xy)) if i > 1, ^ 

where 

x id := (2j + l)/2\ j E {0, . . . , T- 1 - 1}. (3.3) 

A d-dimensional basis function is constructed from the one dimensional ones using a tensor 
product construction: 

d 

( Pl,i( x ) : =Y[ft i ,ii( x i)> ( 3 - 4 ) 

i=l 

for x = (x 1 ,...,x d ) e [0,l] d . Here I = (£i,...,£ d ), £ { e {0,1,2,...}, denotes the level of 
the basis function and j = ( j 1; . . . , j i G {0, . . . , 2 £i — 1}, its center. 

Theorem 3.1 (Haar basis): The set 

^={/mMgN ,jG{0,...,2 1 -1}} 
is an orthonormal basis of L 2 ([0, 1]), the Haar basis. Similarly, the set 

^={ % |le<j t G{0,...,2^-l}} 
is an orthonormal basis of L 2 ([0, l] d ). 



6 



Level 



Level 1 



Level 2 



1/4 1/2 



3/4 1 



Figure 1: First three levels of the ID Haar basis 

Figure [I] shows the basis functions of the first three levels of the one dimensional Haar 
basis. It will prove useful to collect all basis functions of one level in one subspace: 

Wi:=spaa{^j | j f G {0, . . . ,2* - 1}} , I G Nq. (3.5) 

Consequently, L 2 = L 2 ([0, l] d ) can be written as the infinite direct sum of the subspaces 
W t , 

L 2 = W t . (3.6) 

In fact, it can also be shown that L 1 = L 1 ([0,l] d ) = ® teN d as well. To see this, 
note that <eI jW^ with 1 — {£ \ || ^ || ^ < n} is the space of characteristic functions 
supported on the uniform decomposition of the unit cube in 2 n subcubes in every direction. 
Moreover, we have 

d 

dim W t = Yl 2 max {°^- 1 > = 2^° ei '\ (3.7) 

i=l 

In order to get a finite dimensional approximation space most appropriate for our pur- 
poses, we are going the choose an optimal finite subset of the basis functions (ft,y Since in 
general we do not have any a priori information about the function to be approximated, 
and since all basis functions in one subspace We deliver the same contribution to the 
approximation error we will use either all or none of them. In other words, the choice for 
the approximation space is transferred to the level of subspaces We. 

3.2 Approximation properties 

The choice of the optimal set of subspaces We relies in the contribution of each of these 
to the approximation error. The following statements give estimates on this. 

Lemma 3.2: Let / G C 1 ([0, 1]) and let cy be its coefficients with respect to the Haar 
basis, i.e. / = c i,jfi,j- Then for i > and all j 

M <2-^ i ||/'|| 00 . 
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For f E C 1 ([0, l] d ) we analogously have for £ ^ and all j 

k J i<2-( £ ^ M ' +i ) /a n wii- 

Proof. For z > 1 

2^Cy = / / - / / 



fix,) + / f)dx- ' /fo) + / f \dx 



and thus 



2~ |%| < 211/Hoo / xdx 
Jo 



which yields the claimed estimate for the ID case. The bound in the ci-dimensional case 
follows similarly. □ 

Using this bound on the contribution of a single basis function to the approximation 
of a given function /, we can derive a bound on the total contribution of a subspace We- 
For // e W t 

\\h\\v < 2-^^ +1) nil^/IU, (3.8) 
\\fi\\» < 2-^o ( ^ +3)/2 nil^/lloo. (3.9) 

3.3 The optimal subspace 

The main idea of the sparse grid approach is to choose cost and (approximation) benefit 
of the approximation subspace in an optimal way. We briefly sketch this idea here, for a 
detailed exposition see [3H [2] . For a set I C Nq of multiindices we define 



iel 

Correspondingly, for / e L 1 , let fi = Yuieifh wnere ft is the orthogonal projection of / 
onto Wt. We define the cost C(£) of a subspace W( as its dimension, 

C(£) = dimW e = 2 E ^° A ~ 1 . 

Since 

ii/ -mi < Ew = Ewi-EW' ( 3 - 10 ) 



the guaranteed increase in accuracy is bounded by the contribution of a subspace We 
which we add to the approximation space. We therefore define the benefit B{£) of We as 
the upper bound on its Li-contribution as derived above, 



B{£) 



(3-11) 



Note that we omited the factor involving derivatives of /. The reason is that it does not 
affect the solution of the optimization problem ( |3.12 ) 



max B(T) 

C(I)<c 



Let C(I) = ^2i e iC(£) and B(I) = X^ 6l -B(^) be the total cost and the total benefit 
of the approximation space Wj. In order to find the optimal approximation space we are 
now solving the following optimization problem: Given a bound c > on the total cost, 
find an approximation space W\ which solves 

(3.12) 

(3.13) 
£f\ Using the 

(3.14) 



One can show (cf. [2]) that I C Nq is an optimal solution to (3.12) iff 

C{£) 



B{£) 



const for £ E dl, 



where the boundary dl is given by dl — {£ E I | I' E I, I' > » 
definitions for cost and benefit as introduced above, we obtain 



C{£) 
B{£) 



-E^^o(^+ 1 ) 



> 2 ^ 



2 2\i\ 



where \£\ means the 1-norm of the vector 



The optimality condition (3.13) thus translates into the simple condition 

\£\ = const for £ E dl. 

As a result, the optimal approximation space is Wj(n) with 

I(N) = {£ e N d | \£\ < N] , 

where the level N = N(c) E N is depending on the chosen cost bound c. Figure [2 
schematically shows the basis functions of the optimal subspace in 2D for N = 3. 

Remark 3.3: Because of the orthogonality of the Haar-basis in L 2 one can take the 



(3.15) 
(3.16) 



squared contribution as the benefit in the L 2 -case (resulting in equality in (3.10)). In this 
case we obtain the optimality condition 

J2(£i + 1) = const for£G<9I (3.17) 

and correspondingly W\ with 



I(N) 



E Nq : + 1 )< N \ 



(3.18) 



N = N(c), as the optimal approximation space. 
£' > I is meant componentwise 
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Figure 2: 3 level sparse basis in 2D. Shaded means value 1, white means value — 1, 
thicker lines are support boundaries. 



3.4 The discretized operator 

Having chosen the optimal approximation space Vn = Wi(n) we now build the corre- 
sponding discretized Frobenius- Perron operator Pn- Since the sparse basis 

B N := {<pe,i I \t\ < N, j, £ {0, . . . , 2* - 1}} (3.19) 

is an L 2 -orthogonal basis of Vn, the natural projection Qn '■ L 2 — > Vn is given by 

Q"f=J2([ f<A V- (3-20) 

All basis functions ip G Bn are piecewise constant and have compact support, so Qn is 
well defined on L 1 as well. Choosing an arbitrary enumeration, the (transition) matrix of 
the discretized Frobenius-Perron operator 

Pn = Qn o P 

with respect to Bn has entries 

Pij = (ft P(fj. (3.21) 



Writing tpi = (pf — ip~ = \cpi\ ■ (xt ~Xi), where \<fi\ is the (constant) absolute value of the 
function over its support and xt an d Xi are the characteristic functions on the supports 
of the positive and negative parts of tpi, we obtain 

Pij = \<Pi\\<Pj\ (J Xt Pxt - J Xi Pxf - / Xi Pxj + I Xi Px]^j , (3.22) 
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which is, by (2.6) 



Pi, = \<Pi\\<Pi\ J2 ±m i X t n S ~* ( X t)) , (3-23) 

where X i = supp cpf and we add the 4 summands like in ( |3.22[ ) . These can be computed 
in the same way as presented in section [2j 

Remark 3.4: We note that 

(a) if the i th basis function is the one corresponding to £ = (0, . . . , 0), then 

Pij — 

(b) The entries of are bounded via 



(c) If Ppjx = Xx with A 7^ 1, then Xj = if the i th basis function is the one corresponding 
to I = (0, . . . , 0). This follows from 



(a) 



[ej Pn)x = CiXx = Xxi. (3.24) 



It is straightforward to show that this property is shared by every Ulam type projec- 
tion method with a constant function as element of the basis of the approximation 
space. This observation is useful for the reliable computation of an eigenvector at 



an eigenvalue close to one (since it is badly conditioned): (3.24) allows us to reduce 



the eigenproblem to the subspace orthogonal to the constant function. 

With the given change in (c) are properties (a)-(c) valid for the numerical realisation as 
well. 

3.5 Convergence 



As has been pointed out in the Introduction and in Section 2.3, statements about the 
convergence of Ulam's method exist in certain cases. Note that for iV = kd, k = 0, 1, 2, . . ., 
the approximation space Wi(jv) includes the Ulam approximation space We with £ = 
(k, . . . , k) and thus we obtain convergence of the Sparse Ulam method as a corollary to 
the convergence of Ulam's method in these cases from the following Lemma (which can 
be proved by standard arguments). An open question is, if in general, the convergence of 
Ulam's method implies convergence of Sparse Ulam. 

Lemma 3.5: \\Q N f - f\\ Ll ™ for / G L\ 

4 Complexity 

In this section, we collect basic statements about the complexity of both methods. 
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4.1 Cost and accuracy 

We defined the total cost of an approximation space as its dimension and the accuracy via 



its contribution or benefit, see (3.11). In this section we derive a recurrence formula for 



these numbers, depending on the level of the optimal subspaces and the system dimension. 
Let C(N, d) be the dimension of Wi(m in phase space dimension d. Then 



N 



C(N, d) = C(N, d - 1) + C ( N ~k,d- 1)2* -1 , 



(4.1) 



fc=i 



since if i = (*,...,*, 0), then the last dimension plays no role in the number of basis 
functions, and the total number of basis function's for such £'s is C(N, d — 1). If on the 



other hand 



*, id) with i d > 0, then the number of basis functions with such ts 



is C(N — id, d — l)2 £d ~ 1 , because there are 2 £d ~ l one-dimensional basis functions of level 
id possible for the tensor product in the last dimension. For d = 1 we simply deal with 
the standard Haar basis, so C(N, 1) = 2 N . 

Lemma 4.1: 

where = means the leading order term in N. 

Proof. By induction on d. The claim holds clearly for d — 1. Assume, it holds for d — 1. 



By considering the recurrence formula (4.1), we see that C(N,d) = p(N) 2 , where p is 



a polynomial of order less or equal to d. Consequentely, 



C{N, d) 



N' 



d-2 nN-d+2 



N 



(d-2)! 
(d-2)\ 

j\rd—i 2 N - d +i 

(d-iy. 



2N-d+l 



(d-2)! 



N 



d-2 



+ 



2N-d+l ftfd-1 

(d-2)\d- 1 



□ 



According to (3.10), the approximation error ||/ — fi\\ is bounded by Yltei 11/^11' i- e - 



II/- All < E Wft 



\£\>N 



if we use the optimal approximation space Wum- By (3.8) this means 



n/ 



All £ E 

\£\>N 



n \wi 
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Again, the constants HI^o ll^j/lloo on ly depend on the function to be approximated. 
Thus, without a priori knowledge about / we need to assume that they can be bounded 
by some common constant and accordingly define the discretization error of the N th level 
sparse basis as 

E(N,d) = J2 2~ E ^° (4+1) . (4.3) 

\£\>N 

Let E(—n, d) for n G N, n > represent the error of the empty basis and £ = (£, id) with 
I e Nq -1 . Then 

E(N,d) = 2" E ^ o(4+1) 
\e\>N 

oo 

^=o \e\>N-e d 

oo 

= ^2-^ +1 )^°) J E(iV-£ d ,rf-l), 
e d =o 

where the expression 7^ 0) has the value 1, if it is true, otherwise 0. This leads, by 
splitting the sum, to the recurrence formula 

N oo 

E{N,d)=E{N,d-l) + J2E(N-k,d-l)2- k - 1 + ^'^(-1^-1). (4.4) 

k=l k=N+l 



v 

=2- JV - 1 E(-l,ci-l) 

We easily compute that £(iV, 1) = 2" JV - 1 for iV > and E(-l, d) = (3/2) d . 
Lemma 4.2: 

where, again, = means the leading order term in N. 

Proof. By induction on d. The claim holds for d — 1, assume it holds for — 1. Then 

(d-2)! ' ^ (d^2)T 

j\jd-22~N-d+l 2" 



= m-9v +E- hf=™ 2 +i 



fc=l 

-N-d N 



-N-l 



2~N-d jyd-1 

(d-2)\d- 1 



□ 
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Comparison with Ulam's method. We now compare the expressions for the asymp- 
totic behaviour of cost and discretization error in dependence of the discretization level 



N and the problem dimension d in Lemmata 4.1 and 4.2 to the corresponding expressions 



for the standard Ulam basis, i.e. the span of the characteristic functions on a uniform 
partition of the unit cube into cubes of edge length 2~ M in each coordinate direction - 
this is <M We. This space consists of {2 M ) d basis functions, the discretization error 

is o(2~ M y 

We thus have - up to constants - the following asymptotic expressions for cost and 
error of the sparse and the standard basis: 





cost 


error 


sparse basis 


(N/2) d - 1 2 N 


(N/2) d ~ 1 2~ N 


standard basis 


2<im 


2 -M 



To highlight the main difference, consider the following simple computation: The 
expressions for the errors are equal if 

M = N + d- (d- l)log 2 N. 

Using this value for M in the cost expression we get N d ~ l 2 N ~ d < 2 dN+d2 ^ d( - d ^^ lo ^ N ^ j. e . 

N 



d + 1 



>log 2 iV-l (4.6) 



as a sufficient condition for the sparse basis to be more efficient than the standard basis. 
Since we neglected constants and lower order terms in this estimate, the only conclusion 
we can draw from this is that from a certain accuracy requirement on, the sparse basis is 
more efficient than the standard one. 



4.2 Computing the matrix entries 

When we use Monte-Carlo quadrature in order to approximate the entries of the transition 
matrix in both methods, the overall computation breaks down into the following three 
steps: 

1. mapping the sample points, 

2. constructing the transition matrix, 

3. solving the eigenproblem. 

While steps 1. and 3. are identical for both methods, step 2. differs significantly. This 
is due to the fact that in contrast to Ulam's method, the basis functions of the sparse 
hierarchical tensor basis have global and non-disjoint supports. 
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4.2.1 Number of sample points 



Applying Monte-Carlo approximation to (3.22), we obtain 



i k=l 



m{Xj) K 



K 



k=l 



where the sample points xf are chosen i.i.d. from a uniform distribution on Xj~ and X~ , 
respectively. In fact, since the union of the supports of the basis functions in one subspace 
We covers all of X, we can reuse the same set of g sample points and their images for each 
of the subspaces We (i.e. ( N ^ d ) times). Note that the number Kj of test points chosen 
in Xp now varies with j since the supports of the various basis functions are of different 
size: on average, Kj = gm(Xp). Accordingly, for the absolute error of p^ we get 

m{Xj) 1 

\Pij - p i:i \ v 3 J— = , (4.9) 

yJm{Xi)m{Xj)^/ g m(Xj) ^ gm(Xi) 

where we used that m(Xf) ~ m(Xi). In the worst case we thus get 

2 N/2 



\Pij Pij | 



which implies 



Q * T^r-r (4-10) 



2 N 

TOL 2 

for the total number of test points required in order to achieve an accuracy of TOL in 
the entries of the transition matrix. 



Comparison with Ulam's method. Aiming at a final accuracy of e > of the 
eigenvector, we have to choose M and accordingly. Assuming that the corresponding 
eigenproblems are well conditioned, TOL = e is a reasonable choice for the required 
accuracy of the entries. This implies a number of 



Q ^ £ 



-(d+2) 



sample points for the standard realisation of Ulam's method (cf. 2.4), and yields, since 

C>s-* (logie-^y- 1 

sample points for the sparse Ulam method. Note that for d > 2, the sparse Ulam method 
requires less sample points than Ulam's method in order to achieve a comparable accuracy 
in the eigenvector approximation. 
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4.2.2 Number of index computations 

While in Ulam's method each sample point is used in the computation of one entry of 
the transition matrix only, this is not the case in the Sparse Ulam method. In fact, each 
sample point (and its image) is used in the computation of |I(iV)| 2 matrix entries, namely 
one entry for each pair (We, W m ) of subspaces. 

Correspondingly, for each sample point x (and its image) and for each £ e I(N), we 
have to compute the index j of the basis function tpg^ e Wg whose support contains x. 

Since (cf. the previous section) the required number of sample points is O (^ t 2 q L2 j and 
\I(N)\ = ( N + d ), this leads to 

IT/*™ 2 N (N + d\ ^ 1 N d nN 2 d ~ 1 N 1 

of these computations (for reasonable d). In contrast, in Ulam's method, the correspond- 
ing number is 

2<2M y 

Q - 1 = TOL> = TOLi dimVM - 
Note that for the Sparse Ulam method the number of index computations is not staying 
proportional to the dimension of the approximation space. However, it is still scaling 
much more mildly with d than for Ulam's method. 

4.2.3 Occupancy of the transition matrix 

The matrix which represents the discretized transfer operator in Ulam's method is sparse: 
the supports of the basis functions are disjoint, and thus Pij ^ only if S(Xj) n X { ^ 0. 
Hence, for a sufficiently fine partition, the number of partition elements X,i which are 
intersected by the image S(Xj) is determined by the local expansion of S. This is a 
fixed number related to a Lipschitz estimate on S and so the matrix of the discretized 
transfer operator with respect to the standard Ulam basis is sparse for sufficiently large 
n. Unfortunately this property is not shared by the matrix with respect to the sparse 
basis as the following considerations show. 

The main reason for this is that the supports of the basis functions in the sparse basis 
are not localised, cf. the thin and long supports of the basis of We for £ = (N, 0, . . . , 0). 
This means that the occupancy of the transition matrix strongly depends on the global 
behaviour of the dynamical system S. Let B e := {<Pe,j | jj G {0, . . . , 2 li — 1}} denote the 
basis of We and let 

nnz(k,f) = \{(i,j) \ S , (supp(v? i )) D supp(</? j ) ^1,^ 6 B k , ipj G B e }\ 

be the number of nonzero matrix entries which arise from the interaction of the basis 
functions from the subspaces W^ and We if W^ is mapped. We define the matrix occupancy 
of a basis Bi = (J eeI Be as 

nnz( J B I ) = nnz(k,f). (4.11) 
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In order to estimate nnz(k, £) we employ upper bounds Lj, % — 1, . . . , d, for the Lipschitz- 
constants of S, cf. Figure [3j We obtain 
Proposition 4.3: 

d r L ■ 2- k *+ 1 -( k '=°)" 



nnz(k,£) < \By\ \\ 



2-4+i-(4=o) 



(4.12) 



Proof. Since we have used upper bounds for the Lipschitz constants, one mapped box 
has at most the extension Lj-2~ kl+1- ( ki=0 ' ) in the i th dimension. Consequently, its support 
intersects with at most 

~L ■ 2- k * +1 -( k *= )" 



2~li+l-(li=Q) 



supports of basis functions from We,. 



□ 




Figure 3: Model for the matrix occupancy in 2D. Shaded and colorless (white) show the 
function values (±|<^|), thicker black lines the support boundaries. 



Remark 4.4: Numerical experiments suggest that the above bound approximates the 



matrix occupancy pretty well. However, it could be improved: (3.21 ) shows that a matrix 



entry could still be zero even if supp(y?i) and supp(P(fj) intersect. This is e.g. the case if 
supp(Py?j) is included in a subset of supp(y9j), where <fi is constant (i.e. does not change 
sign). The property ||-P/|| L i = ||/||^i for / > and positivity (see [21]) of P imply 
Pij = 0, since \\<p~j\\ L1 = \\<Pj\\ L i- 



An asymptotic estimate. Let us examine nnz(k, £) for k 
(N, 0, . . . , 0). By taking all Lipschitz-constants Lj = 1 we get 



(0,...,0,N) and 



nnz(k,f) > 2 



2N 
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since \B^\ = 2 N 1 and the image of each basis function from B^ intersects with each basis 
function from Bi. Since \B^\ w N d ~ 1 2 N , we get 

2 2N < nnz(B N ) < N 2d ' 2 2 2N . (4.13) 

The exponential term dominates the polynomial one for large N , so asymptotically we 
will not get a sparse matrix. 

Does this affect the calculations regarding efficiency made above? As already men- 
tioned, the error of Ulam's method is e = (D(2~ M ) while its cost is 2 dM = 0(s~ d ). As- 
suming that the Sparse Ulam method has the same error e = 0(N d ~ 1 2~ N ), its worst-case 
cost IS 

0^-232*) < £ -2 N 4d-4 < £ -2 log ( £ -l)4d- 4) 

where we used JSf d ~ l 2~ N < 2~ N I 2 , which leads to log(e) < —N. Clearly, this means - 



similarily to subsection 4.1 - partially overcoming the curse of dimensionality. Even in 
the most optimistic case, ie. the costs are of 0{2 2N ), we have at least 0(e~ 2 N 2d ~ 2 ) costs, 
so the sparse-Ulam-method is efficienter than Ulam's, only if d > 3. 



5 Numerical examples 
5.1 A 3d expanding map 

We compare both methods by approximating the invariant density of a simple three 
dimensonal map. Let Si : [0, 1] — > [0, 1] be given by 

l-2\x- 1/2|, 

f 2x/(l-x), x < 1/3 
\ (l-x)/(2x), else, 

f 2x/{l -x 2 ), x<V2-l 
\ (l-x 2 )/(2x), else, 

and S : [0, l] 3 — > [0, l] 3 be the tensor product map 

S(x) = {Si(xi), S 2 (x 2 ), S 3 (x 3 )) T , 

where x = {x\, x 2 , x 3 ) T . This map is expanding and its unique invariant density is given 
by ' 

H{X) = n(l + x 2 )(l + x 2 y 

(cf. 

We approximate h by Ulam's method on an equipartition of 2 3M boxes for M = 4,5,6 
as well as by the Sparse Ulam method on levels N = 4,5, 6. Figure [4] shows the L 1 - 
error for both methods in dependence of the number of sample points (left) as well as the 
number of index computations along these curves (right). While the Sparse Ulam method 
requires almost three orders of magnitude fewer sample points than Ulam's method, the 



Si(x) = 
S 2 {x) = 

S 3 (x) = 
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20 25 30 

log 2 (# sample points) 



35 



20 22 24 26 28 30 32 34 36 
log 2 (# projections) 



Figure 4: Left: L 1 -error of the approximate invariant density in dependence on the num- 
ber of sample points for levels N, M = 4, 5, 6. Right: Corresponding number of index 
computations. 

number of index computations is roughly comparable. This is in good agreement with 
our theoretical considerations in sections I4.2.1I and I4.2.21 

In Figure [H] we show the dependence of the L 1 -error on the number of nonzeros in the 
transition matrices for levels M, N = 3, . . . , 6. Again, the Sparse Ulam method is ahead 
of Ulam's method by almost an order of magnitude. 

10" 1 | , — , — , — , — 




10 3 10 4 10 5 10 6 10 7 

# flops(A- x) 

Figure 5: L 1 -error of the approximate invariant densities in dependence on the number 
of nonzeros in the transition matrices. 

5.2 A 4d conservative map 

In a second numerical experiment, we approximate a few dominant eigenfunctions of the 
transfer operator for an area preserving map. Since the information on almost invariant 
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sets does not change [T7] (but the eigenproblem becomes easier to solve) we here consider 
the symmetrized transition matrix |(P + P T ), cf. also |22j . 
Consider the so called standard map S p : [0, l] 2 — ► [0, l] 2 , 

(xi, X2) T i— > (xi + X2 + psin(27ra;i) + 0.5, 0:2 + psin (27ra;i)) T mod 1, 

where < p < 1 is a parameter. This map is area preserving, i.e. the Lebesgue measure 
is invariant w.r.t. S p . Figure [6] shows approximations of the eigenfunctions at the second 
largest eigenvalue of S p for p = 0.3 (left) and p = 0.6 (right) computed via Ulam's method 
on an equipartition of 2 2 ' 6 boxes (i.e. for M = 6). 




x i 



Figure 6: Eigenfunction of the symmetrized transition matrix at the second largest eigen- 
value for the standard map. Left: p = 0.3, A2 = 0.97, right: p = 0.6, A2 = 0.93. 

We now define S : [0, l] 4 -> [0, l] 4 by 

S = S Pl ® S P2 , 

with pi = 0.3 and p 2 = 0.6. Note that the eigenfunctions of S are tensor products of 
the eigenfunctions of the S Pi . This is reflected in Figures [7] and [8] where we show the 
eigenfunctions at the two largest eigenvalues, computed by the Sparse Ulam method on 
level N = 8, using 2 24 sample points overall. Clearly, each of these two is a tensor product 
of the (2d-) eigenfunction at the second largest eigenvalue with the (2d-) invariant (i.e. 
constant) density. 

Figure [9] shows an eigenfunction for which both factors of the tensor product are 
non-constant. The resolution of this eigenfunction seems worse than for those with one 
constant factor. In fact, for an approximation of an eigenfunction which is constant 
with respect to, say, x 3 and x 4 it suffices to consider subspaces Wi with £ = (£1, £ 2 ,0,0). 
All other coefficients are zero, the problem reduces to a two-dimensional one and so the 
eigenfunctions are not perturbed by basis functions varying in the £3 and X4 directions. 
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Figure 7: Approximate eigenfunction at A 2 = 0.97. Left: i> 2 (-, •, x%, x±) for fixed X3,£4, 
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